pacman::p_load(tmap, sf, sp, DT, stplanr, performance, reshape2, ggpubr, units, tidyverse)Spatial Interactions
1.1 Overview
Spatial interaction represent the dynamic flow or movementmof people, material, or information between locations in geographical space. Each spatial interaction is composed of a discrete origin and destination (OD) pair and represented as a spatial interaction matrix of centroids from origin and destination. The connection between origin and their destination can be visualized by desire lines, typically straight lines that records the the number of people travelling between locations.
Learning Objectives 1. Build a spatial interaction matrix 2. Construct desire lines in geospatial data
1.2 Load packages
sf performs geospatial data import, integration, processing and transformation. DT enables R data objects (matrices or data frames) to be displayed as tables on HTML pages. tidyverse performs data import, integration, wrangling and visualisation. tmap creates thematic maps. stplanranalyses OD matrix.
1.3 Import data
odbus contains the number of trips by weekdays and weekends from origin to destination bus stops. This is a September 2023 dataset in Passenger Volume by Origin Bus Stop from LTA Datamall and reflects the passenger trip traffic.
Show the code
odbus <- read_csv("data/aspatial/origin_destination_bus_202309.csv")
glimpse(odbus)Rows: 5,714,196
Columns: 7
$ YEAR_MONTH <chr> "2023-09", "2023-09", "2023-09", "2023-09", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKENDS/HOLIDAY", "WEEKDAY",…
$ TIME_PER_HOUR <dbl> 17, 10, 10, 7, 7, 11, 16, 16, 16, 20, 7, 11, 11, 1…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <chr> "24499", "65239", "65239", "23519", "23519", "5250…
$ DESTINATION_PT_CODE <chr> "22221", "65159", "65159", "23311", "23311", "4204…
$ TOTAL_TRIPS <dbl> 1, 9, 2, 6, 1, 2, 18, 3, 2, 1, 2, 5, 3, 5, 5, 19, …
The output shows that odbus is an aspatial data containing 5,714,196 records and 7 fields. Origin and destination codes will be converted to from character to factor.
odbus <- odbus %>%
mutate(ORIGIN_PT_CODE = as.factor(ORIGIN_PT_CODE),
DESTINATION_PT_CODE = as.factor(DESTINATION_PT_CODE))busstops is a geospatial dataset that contains the detailed information for all bus stops currently serviced by buses, including bus stop code, road name, description, location coordinates. The output indicates that the geospatial objects are point features. There are 5161 features and 3 fields. It is in SVY21 projected coordinates system with XY dimension.
Show the code
busstop <- st_read(dsn = "data/geospatial/BusStopLocation_Jul2023", layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Hands_on_Ex/Hands_on_Ex03/data/geospatial/BusStopLocation_Jul2023'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
Show the code
glimpse(busstop)Rows: 5,161
Columns: 4
$ BUS_STOP_N <chr> "22069", "32071", "44331", "96081", "11561", "66191", "2338…
$ BUS_ROOF_N <chr> "B06", "B23", "B01", "B05", "B05", "B03", "B02A", "B02", "B…
$ LOC_DESC <chr> "OPP CEVA LOGISTICS", "AFT TRACK 13", "BLK 239", "GRACE IND…
$ geometry <POINT [m]> POINT (13576.31 32883.65), POINT (13228.59 44206.38),…
mpsz is a geospatial dataset from the Master Plan 2019, a forward looking guiding plan for Singapore’s development in the medium term over the next 10 to 15 years published in 2019. Note this mpsz differs from that in previous chapter, Data Wrangling.
The output indicates that the geospatial objects are multipolygon features. There are 332 features and 6 fields. It is in WGS84 projected coordinates system with XY dimension.
Show the code
mpsz = st_read(dsn="data/geospatial/MPSZ-2019", layer="MPSZ-2019") %>%
st_transform(crs=3414)Reading layer `MPSZ-2019' from data source
`/Users/chockwankee/Documents/chockwk/ISSS624_Geospatial_Analytics/Hands_on_Ex/Hands_on_Ex03/data/geospatial/MPSZ-2019'
using driver `ESRI Shapefile'
Simple feature collection with 332 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
Show the code
glimpse(mpsz)Rows: 332
Columns: 7
$ SUBZONE_N <chr> "MARINA EAST", "INSTITUTION HILL", "ROBERTSON QUAY", "JURON…
$ SUBZONE_C <chr> "MESZ01", "RVSZ05", "SRSZ01", "WISZ01", "MUSZ02", "MPSZ05",…
$ PLN_AREA_N <chr> "MARINA EAST", "RIVER VALLEY", "SINGAPORE RIVER", "WESTERN …
$ PLN_AREA_C <chr> "ME", "RV", "SR", "WI", "MU", "MP", "WI", "WI", "SI", "SI",…
$ REGION_N <chr> "CENTRAL REGION", "CENTRAL REGION", "CENTRAL REGION", "WEST…
$ REGION_C <chr> "CR", "CR", "CR", "WR", "CR", "CR", "WR", "WR", "CR", "CR",…
$ geometry <MULTIPOLYGON [m]> MULTIPOLYGON (((33222.98 29..., MULTIPOLYGON (…
1.4 Prepare data
We will focus on weekday afternoon peak hours to capture high movements and interactions.
::: panel-tabset
Step 1
Commuting flows on weekday between 5pm to 8pm are extracted and there are 226,658 records during weekday afternoon peak hours based on bus stop code. The data table can be viewed using datatable() and saved to rds using write_rds().
Show the code
odbus5_8 <- odbus %>%
filter(DAY_TYPE == "WEEKDAY") %>%
filter(TIME_PER_HOUR >= 5 & TIME_PER_HOUR <= 8) %>%
group_by(ORIGIN_PT_CODE, DESTINATION_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
datatable(odbus5_8)# save and read
write_rds(odbus5_8, "rds/odbus5_8.rds")
odbus5_8 <- read_rds("rds/odbus5_8.rds")Step 2
To integrate the planning subzone codes SUBZONE_C of mpsz sf data frame into busstop sf data frame, st_intersection() is used to perform point and polygon overly and the output will be in point sf object.select() of dplyr package is then use to retain only BUS_STOP_N and SUBZONE_C in the busstop_mpsz sf data frame. There are 5,156 bus stops in the subzones.
busstop_mpsz <- st_intersection(busstop, mpsz) %>%
select(BUS_STOP_N, SUBZONE_C) %>%
st_drop_geometry()
datatable(busstop_mpsz)Step 3
Append the planning subzone code from busstop_mpsz data frame onto odbus5_8 data frame.
od_data <- left_join(odbus5_8, busstop_mpsz,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE,
ORIGIN_SZ = SUBZONE_C,
DESTIN_BS = DESTINATION_PT_CODE)
glimpse(od_data)Rows: 224,042
Columns: 4
Groups: ORIGIN_BS [5,006]
$ ORIGIN_BS <chr> "01012", "01012", "01012", "01012", "01012", "01012", "01012…
$ DESTIN_BS <fct> 01112, 01113, 01121, 01211, 01311, 01621, 07371, 60011, 6002…
$ TRIPS <dbl> 146, 91, 57, 88, 137, 2, 7, 27, 14, 25, 18, 7, 1, 2, 1, 12, …
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
There are 224,042 commuting flows with origin and destinations.
Step 4
Check for duplicate records.
duplicate <- od_data %>%
group_by_all() %>%
filter(n() > 1) %>%
ungroup()
duplicate# A tibble: 1,084 × 4
ORIGIN_BS DESTIN_BS TRIPS ORIGIN_SZ
<chr> <fct> <dbl> <chr>
1 11009 01411 4 QTSZ01
2 11009 01411 4 QTSZ01
3 11009 01421 18 QTSZ01
4 11009 01421 18 QTSZ01
5 11009 01511 10 QTSZ01
6 11009 01511 10 QTSZ01
7 11009 01521 3 QTSZ01
8 11009 01521 3 QTSZ01
9 11009 01611 3 QTSZ01
10 11009 01611 3 QTSZ01
# … with 1,074 more rows
The output shows 1,084 duplicated rows with same values across 4 fields. Duplicates are removed by unique().
od_data <- unique(od_data)
glimpse(od_data)Rows: 223,500
Columns: 4
Groups: ORIGIN_BS [5,006]
$ ORIGIN_BS <chr> "01012", "01012", "01012", "01012", "01012", "01012", "01012…
$ DESTIN_BS <fct> 01112, 01113, 01121, 01211, 01311, 01621, 07371, 60011, 6002…
$ TRIPS <dbl> 146, 91, 57, 88, 137, 2, 7, 27, 14, 25, 18, 7, 1, 2, 1, 12, …
$ ORIGIN_SZ <chr> "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", "RCSZ10", …
From previous 224,042 records, 542 duplicates are removed and 223,500 unique records remain in od_data.
Step 5
Append planning subzone codes to od_data.
od_data <- left_join(od_data , busstop_mpsz,
by = c("DESTIN_BS" = "BUS_STOP_N"))
od_data <- od_data %>%
rename(DESTIN_SZ = SUBZONE_C) %>%
drop_na() %>%
group_by(ORIGIN_SZ, DESTIN_SZ) %>%
summarise(MORNING_PEAK = sum(TRIPS)) %>%
ungroup()
glimpse(od_data)Rows: 20,597
Columns: 3
$ ORIGIN_SZ <chr> "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01", "AMSZ01…
$ DESTIN_SZ <chr> "AMSZ01", "AMSZ02", "AMSZ03", "AMSZ04", "AMSZ05", "AMSZ06…
$ MORNING_PEAK <dbl> 1643, 7435, 10995, 1913, 5455, 1546, 1169, 2138, 1319, 11…
Summing the trips by origin and destination, there are 20,597 OD flows.
#save and read
write_rds(od_data, "rds/od_data.rds")
od_data <- read_rds("rds/od_data.rds")1.5 Plot data
To ensure that spatial flows are between different subzones, the intra-zonal flows within the same subzones will be removed. prepare a desire line by using stplanr package
od_data1 <- od_data[od_data$ORIGIN_SZ!=od_data$DESTIN_SZ,]Create desire lines
od2line() takes a data frame and a spatial object as inputs and outputs geographic lines representing movement between origins and destinations.
flowline <- od2line(flow = od_data1,
zones = mpsz,
zone_code = "SUBZONE_C")tmap_options(check.and.fix = TRUE)
tm_shape(mpsz) + tm_polygons() + flowline %>%
tm_shape() + tm_lines(lwd = "MORNING_PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.9)
To visualize OD flows above 10000,
tm_shape(mpsz) + tm_polygons() + flowline %>%
filter(MORNING_PEAK >= 10000) %>%
tm_shape() + tm_lines(lwd = "MORNING_PEAK",
style = "quantile",
scale = c(0.1, 1, 3, 5, 7, 10),
n = 6,
alpha = 0.9)